The Myhill-Nerode Theorem as stated in [6] says that for a set $R$ of strings
over a finite alphabet $\Sigma$, the following statements are equivalent:

(i) $R$ is regular

(ii) $R$ is a union of classes of a right-invariant equivalence relation of finite
index

(iii) the relation $\equiv_R$ is of finite index, where $x \equiv_R y$ iff
$\forall z \in \Sigma^* \; (xz \in R \Leftrightarrow yz \in R)$.

This  result  generalizes  in  a  straightforward  way  to  automata  on  finite 
trees.  I  rediscovered  this  generalization  in  connection  with  work  on  finitely 
presented  algebras,  and  stated  it  without  proof  or  attribution  in  [7,  8],  being 
at that time under the impression that it was folklore and completely elementary.
It was again rediscovered independently by Z. Fülöp and S. Vágvölgyi
and  reported  in  a  recent  contribution  to  this  Bulletin  [5].  In  that  paper  they 
attribute  the  result  to  me. 

In  fact,  the  result  goes  back  at  least  ten  years  earlier  to  the  late  1960s.  It  is 
difficult  to  attribute  it  to  any  one  paper,  since  it  seems  to  have  been  in  the  air 
at  a  time  when  the  theory  of  finite  automata  on  trees  was  undergoing  intense 
development. In a sense, it is an inevitable consequence of Myhill and Nerode’s
work [9, 10], since “conventional finite automata theory goes through for
the generalization — and it goes through quite neatly” [11]. The first explicit
mention of the equivalence of the tree analogs of (i) and (ii) seems to be
by Brainerd [2, 3] and Eilenberg and Wright [4], although the latter claim
that their paper “contains nothing that is essentially new, except perhaps
for a point of view” [4]. A relation on trees analogous to $\equiv_R$ was defined
and clause (iii) added explicitly by Arbib and Give’on [1, Definition 2.13],
although it is also essentially implicit in work of Brainerd [2, 3].

* Bull. Europ. Assoc. Theor. Comput. Sci. 47 (June 1992), 170-173.

† Computer Science Department, Cornell University, Ithaca, New York 14853, USA

All  the  cited  papers  from  the  1960s  involve  heavy  use  of  universal  algebra 
and/or category theory. In these papers, a tree automaton is a finite
$\Sigma$-algebra, and the map $\hat\delta$ (see below) is a $\Sigma$-algebra homomorphism. Although
exceedingly elegant, this approach renders the result less accessible to the
average computer science undergraduate. Fülöp and Vágvölgyi take a somewhat
different approach, appealing to the theory of term rewriting systems,
tree transducers, and NTS grammars. Again, although this approach reveals
some  interesting  and  fundamental  connections,  it  is  rather  involved  and  not 
suitable  fare  for  undergraduates. 

In contrast, the proof I had in mind when writing [7, 8] is a straightforward
and comparatively mundane generalization of [6]. It can be appreciated
by computer science undergraduates familiar with the Myhill-Nerode Theorem
but with no knowledge of universal algebra, category theory, or term
rewriting systems.

My  purposes  in  writing  this  note  are  threefold:  to  set  the  record  straight 
with respect to attribution; to apologize to Fülöp and Vágvölgyi for giving
them  the  impression  that  I  should  be  credited  with  the  result;  and  to  present 
an  elementary  proof  in  the  style  of  [6]. 


Definitions 

Let $\Sigma$ be a finite ranked alphabet. The rank of $f \in \Sigma$ is called its arity. The
set of $n$-ary elements of $\Sigma$ is denoted $\Sigma_n$. The set of ground terms over $\Sigma$
is denoted $T_\Sigma$. A congruence on $T_\Sigma$ is an equivalence relation $\equiv$ such that
$f s_1 \ldots s_n \equiv f t_1 \ldots t_n$ whenever $f \in \Sigma_n$ and $s_i \equiv t_i$, $1 \le i \le n$. A congruence
$\equiv$ is finitely generated if it is generated by a finite subrelation. It is of finite
index if there are only finitely many $\equiv$-classes. It respects $A \subseteq T_\Sigma$ if $A$ is a
union of $\equiv$-classes.

A (deterministic, bottom-up) tree automaton over $\Sigma$ is a tuple

$$M = (Q, \Sigma, F, \delta)$$

where $Q$ is a finite set of states, $F \subseteq Q$ is a set of final states, and $\delta$ is a transition
function

$$\delta : \bigcup_n \Sigma_n \times Q^n \to Q .$$

In other words, $\delta$ takes an input symbol $f \in \Sigma$ and an $n$-tuple of states
$q_1, \ldots, q_n$, where $n$ is the arity of $f$, and produces a next state
$\delta(f, q_1, \ldots, q_n) \in Q$.

Tree automata over $\Sigma$ run on ground terms over $\Sigma$. Informally, an automaton
starts at the leaves and moves upward, associating a state with
each subterm inductively. If the immediate subterms $t_1, \ldots, t_n$ of the term
$f t_1 \ldots t_n$ are labeled with states $q_1, \ldots, q_n$ respectively, then the term $f t_1 \ldots t_n$
will be labeled with state $\delta(f, q_1, \ldots, q_n)$. The term is accepted if the state
labeling the root is in $F$.

Formally, define the labeling function $\hat\delta : T_\Sigma \to Q$ inductively by

$$\hat\delta(f t_1 \ldots t_n) = \delta(f, \hat\delta(t_1), \ldots, \hat\delta(t_n)) .$$

Note that the basis of the induction is included in this definition: $\hat\delta(c) = \delta(c)$
for $c$ nullary.

The term $t$ is said to be accepted by $M$ if $\hat\delta(t) \in F$. The set of terms
accepted by $M$ is denoted $\mathcal{L}(M)$. A set of terms is called regular if it is $\mathcal{L}(M)$
for some $M$.
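
The definitions above can be illustrated with a minimal Python sketch. The encoding is an assumption made purely for illustration: a ground term $f t_1 \ldots t_n$ is a nested tuple `(f, t1, ..., tn)`, the transition function is a dictionary keyed by `(symbol, tuple_of_states)`, and the names `label` and `accepts` are hypothetical. The example automaton is likewise an assumption: it evaluates Boolean terms, so it should accept exactly the terms that evaluate to true.

```python
from itertools import product

def label(delta, term):
    """Labeling function: compute the state of `term` bottom-up."""
    symbol, *subterms = term
    child_states = tuple(label(delta, s) for s in subterms)
    return delta[(symbol, child_states)]

def accepts(delta, final, term):
    """A term is accepted if the state labeling its root is final."""
    return label(delta, term) in final

# Illustrative automaton (an assumption, not from the text):
# the states are the truth values, and the final state is True.
delta = {("0", ()): False, ("1", ()): True}
for a, b in product((False, True), repeat=2):
    delta[("not", (a,))] = not a
    delta[("and", (a, b))] = a and b
    delta[("or", (a, b))] = a or b
final = {True}

t = ("or", ("0",), ("not", ("0",)))   # the term or(0, not(0))
print(accepts(delta, final, t))       # prints True
```

Note that the nullary case needs no special treatment: for a constant the tuple of child states is empty, which mirrors the remark that the basis of the induction is included in the definition.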

This  definition  extends  the  usual  definition  of  automata  on  finite  strings 
in  a  natural  way:  we  can  think  of  an  automaton  on  strings  over  a  finite 
alphabet $\Sigma$ as a tree automaton over $\Sigma \cup \{\Box\}$ turned on its side, where we
assign $\Box$ arity 0 and elements of $\Sigma$ arity 1.
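
A small sketch of this correspondence, under stated assumptions: terms are nested tuples, the helper names `string_to_term` and `dfa_to_tree_automaton` are hypothetical, and the convention that the first letter of the string is the innermost symbol is one possible choice (it makes the bottom-up run read the string left to right). The nullary symbol $\Box$ is mapped to the start state of the string automaton.

```python
def string_to_term(word, end="□"):
    """Encode a string as a unary term; the first letter is innermost."""
    term = (end,)
    for letter in word:
        term = (letter, term)
    return term

def dfa_to_tree_automaton(start, step, alphabet, states, end="□"):
    """Build a tree transition table from a string automaton.

    `step` maps (state, letter) to the next state.
    """
    delta = {(end, ()): start}
    for a in alphabet:
        for q in states:
            delta[(a, (q,))] = step[(q, a)]
    return delta

def label(delta, term):
    """Bottom-up labeling function on nested-tuple terms."""
    symbol, *subterms = term
    return delta[(symbol, tuple(label(delta, s) for s in subterms))]

# Illustrative string automaton (an assumption): even number of a's.
states = {0, 1}
step = {(q, "a"): 1 - q for q in states}
step.update({(q, "b"): q for q in states})
delta = dfa_to_tree_automaton(0, step, "ab", states)
final = {0}

print(label(delta, string_to_term("abab")) in final)   # prints True
```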


The  Myhill-Nerode  Theorem  for  Trees 

For a given $R \subseteq T_\Sigma$, define $s \equiv_R t$ if for all terms $u$ with exactly one
occurrence of a variable $x$ and no other variables,

$$u[x/s] \in R \iff u[x/t] \in R .$$

Theorem (Myhill-Nerode Theorem for trees) Let $R \subseteq T_\Sigma$. The following
are equivalent:

(i) $R$ is regular

(ii) there exists a finitely generated congruence of finite index respecting $R$

(iii) the relation $\equiv_R$ is of finite index.

Proof. (i) $\to$ (iii) Suppose $R = \mathcal{L}(M)$ where $M = (Q, \Sigma, F, \delta)$. We show
that if $\hat\delta(s) = \hat\delta(t)$ then $s \equiv_R t$, thus there are no more $\equiv_R$-classes than states
of $M$. If $\hat\delta(s) = \hat\delta(t)$ and $u$ is any term with exactly one occurrence of a
variable $x$, then according to the behavior of the machine,

$$\hat\delta(u[x/s]) = \hat\delta(u[x/t]) .$$

This follows formally from an easy inductive argument on the depth of $u$.
Thus

$$u[x/s] \in R \iff \hat\delta(u[x/s]) \in F \iff \hat\delta(u[x/t]) \in F \iff u[x/t] \in R .$$

Since $u$ was arbitrary, $s \equiv_R t$.
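
The key step of this argument can be checked on small examples. The following is only a sketch under assumptions: terms are nested tuples, the variable $x$ is represented by the bare string `"x"`, the helper `substitute` is a hypothetical name, and the Boolean automaton is an illustrative choice. It checks, for a handful of contexts $u$, that two terms labeled with the same state remain indistinguishable after substitution; it is a spot check, not a proof.

```python
from itertools import product

def label(delta, term):
    """Bottom-up labeling function on nested-tuple terms."""
    symbol, *subterms = term
    return delta[(symbol, tuple(label(delta, s) for s in subterms))]

def substitute(context, s, var="x"):
    """Return context[x/s]; the variable is the bare string `var`."""
    if context == var:
        return s
    symbol, *subterms = context
    return (symbol,) + tuple(substitute(c, s, var) for c in subterms)

# Illustrative automaton (an assumption): evaluation of Boolean terms.
delta = {("0", ()): False, ("1", ()): True}
for a, b in product((False, True), repeat=2):
    delta[("not", (a,))] = not a
    delta[("and", (a, b))] = a and b
    delta[("or", (a, b))] = a or b

s = ("not", ("0",))   # labeled True
t = ("1",)            # labeled True as well
contexts = [
    "x",
    ("not", "x"),
    ("and", "x", ("0",)),
    ("or", ("0",), ("not", "x")),
]
same = all(
    label(delta, substitute(u, s)) == label(delta, substitute(u, t))
    for u in contexts
)
print(same)   # prints True
```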

(iii) $\to$ (ii) We show that $\equiv_R$ is a finitely generated congruence respecting
$R$. It is clearly an equivalence relation. It is also a congruence, since if $f$ is
$n$-ary and $s_i \equiv_R t_i$, $1 \le i \le n$, then for any $u$ with exactly one occurrence of
a variable $x$,

$$u[x/f s_1 \ldots s_{i-1} s_i t_{i+1} \ldots t_n] \in R \iff u[x/f s_1 \ldots s_{i-1} y\, t_{i+1} \ldots t_n][y/s_i] \in R$$
$$\iff u[x/f s_1 \ldots s_{i-1} y\, t_{i+1} \ldots t_n][y/t_i] \in R$$
$$\iff u[x/f s_1 \ldots s_{i-1} t_i t_{i+1} \ldots t_n] \in R ,$$

therefore

$$f s_1 \ldots s_{i-1} s_i t_{i+1} \ldots t_n \equiv_R f s_1 \ldots s_{i-1} t_i t_{i+1} \ldots t_n ,$$

and $f s_1 \ldots s_n \equiv_R f t_1 \ldots t_n$ follows from transitivity. It respects $R$, since if
$s \equiv_R t$, then

$$s \in R \iff x[x/s] \in R \iff x[x/t] \in R \iff t \in R .$$

It is finitely generated, since any congruence $\equiv$ of finite index is: if $U \subseteq T_\Sigma$
is a complete set of representatives for the $\equiv$-classes, then $\equiv$ is generated by
the finite subrelation consisting of all equations in $\equiv$ of the form

$$f u_1 \ldots u_n \equiv u$$

for $u_1, \ldots, u_n, u \in U$ and $f \in \Sigma_n$. This is because every term is equivalent
to some $u \in U$ in the congruence generated by this subrelation, as an easy
inductive argument shows.

(ii) $\to$ (i) Let $\equiv$ be the congruence, and let $[t]$ denote the $\equiv$-class of
$t$. Form an automaton $M$ with states $Q = \{[t] \mid t \in T_\Sigma\}$, final states
$F = \{[t] \mid t \in R\}$, and transition function

$$\delta(f, [t_1], \ldots, [t_n]) = [f t_1 \ldots t_n] .$$

The function $\delta$ is well-defined, since if $[s_i] = [t_i]$, $1 \le i \le n$, then
$[f s_1 \ldots s_n] = [f t_1 \ldots t_n]$. Moreover, an easy induction shows that $\hat\delta(t) = [t]$ for all $t$, thus

$$t \in R \iff [t] \in F \iff \hat\delta(t) \in F \iff t \in \mathcal{L}(M) .$$


□ 
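
The construction in the step (ii) $\to$ (i) can be sketched in Python when the congruence is given concretely. Everything specific here is an assumption for illustration: the congruence is supplied as a function `cls` mapping a term to an identifier of its class, one representative per class is given in the dictionary `representatives`, and the example takes $R$ to be the set of terms over the hypothetical alphabet $\{a, b, g\}$ (with $g$ binary) that contain an even number of occurrences of $a$.

```python
from itertools import product

def automaton_from_congruence(arity, representatives, cls, in_R):
    """States are class identifiers; delta(f, [t1], ..., [tn]) = [f t1 ... tn]."""
    delta = {}
    for f, n in arity.items():
        for ids in product(list(representatives), repeat=n):
            term = (f,) + tuple(representatives[c] for c in ids)
            delta[(f, ids)] = cls(term)
    final = {c for c, rep in representatives.items() if in_R(rep)}
    return delta, final

def label(delta, term):
    """Bottom-up labeling function on nested-tuple terms."""
    symbol, *subterms = term
    return delta[(symbol, tuple(label(delta, s) for s in subterms))]

def parity_class(term):
    """Class of a term: number of occurrences of the constant a, modulo 2."""
    symbol, *subterms = term
    return (int(symbol == "a") + sum(parity_class(s) for s in subterms)) % 2

arity = {"a": 0, "b": 0, "g": 2}
representatives = {0: ("b",), 1: ("a",)}
delta, final = automaton_from_congruence(
    arity, representatives, parity_class, lambda t: parity_class(t) == 0
)

t = ("g", ("a",), ("g", ("a",), ("b",)))   # two occurrences of a
print(label(delta, t) in final)            # prints True
```

Because the relation is a congruence, the value of `cls(term)` does not depend on which representatives are chosen, which is the well-definedness argument in code form.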

In complete analogy with the case of strings, the congruences on $T_\Sigma$ of
finite index respecting $R$ are in one-to-one correspondence (up to isomorphism)
with the deterministic bottom-up finite tree automata with no inaccessible
states accepting $R$, and there is a unique minimal such automaton
corresponding to the congruence $\equiv_R$.
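
One way to compute the state classes of the minimal automaton from a given one is partition refinement, in the style of Moore's algorithm for string automata. The sketch below is an assumption-laden illustration, not a method taken from the papers cited: the names `accessible_states` and `nerode_classes` are hypothetical, the transition table is a dictionary that must be total on accessible states, and states must be sortable. It first discards inaccessible states, then repeatedly splits blocks of states that some one-step context sends to different blocks.

```python
from itertools import product

def accessible_states(arity, delta):
    """States that label at least one ground term."""
    reached = set()
    changed = True
    while changed:
        changed = False
        for f, n in arity.items():
            for args in product(list(reached), repeat=n):
                q = delta[(f, args)]
                if q not in reached:
                    reached.add(q)
                    changed = True
    return reached

def nerode_classes(arity, delta, final):
    """Partition the accessible states into indistinguishable blocks."""
    states = sorted(accessible_states(arity, delta))
    block = {q: int(q in final) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            # Signature: own block plus the blocks reached by placing q
            # in each argument position next to all other accessible states.
            sig = [block[q]]
            for f, n in sorted(arity.items()):
                for i in range(n):
                    for others in product(states, repeat=n - 1):
                        args = others[:i] + (q,) + others[i:]
                        sig.append(block[delta[(f, args)]])
            refined[q] = ids.setdefault(tuple(sig), len(ids))
        if len(set(refined.values())) == len(set(block.values())):
            break   # no block was split, so the partition is stable
        block = refined
    classes = {}
    for q in states:
        classes.setdefault(block[q], set()).add(q)
    return {frozenset(c) for c in classes.values()}

# Illustrative automaton (an assumption) with a redundant state T2.
arity = {"0": 0, "1": 0, "not": 1, "and": 2}
true_like = {"T", "T2"}
delta = {("0", ()): "F", ("1", ()): "T"}
for a in ("F", "T", "T2"):
    delta[("not", (a,))] = "F" if a in true_like else "T2"
    for b in ("F", "T", "T2"):
        both = a in true_like and b in true_like
        delta[("and", (a, b))] = "T" if both else "F"

print(nerode_classes(arity, delta, {"T", "T2"}))
```

In this example the states `T` and `T2` end up in the same block, so the minimal automaton has two states.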